Design of an Extraction System for Definitional Contexts from Biomedical Corpora
Abstract
In this paper we report progress on the design of a methodology for extracting definitional contexts from Spanish biomedical corpora. The methodology relies on the following modules: (i) a term extractor based on a hybrid method, (ii) a set of verbs that configure the syntactic structure of a definitional context, and (iii) a chunker that recognizes the noun phrases introducing a definition, considering the lexical relation of hyponymy/hypernymy, where the hyponym is the defined term and the hypernym is the Genus Term, which represents a conceptual category associated with that term.
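As a rough illustration of how a definitional verb can link a term (hyponym) to its Genus Term (hypernym), the following minimal sketch uses a single regular expression over one Spanish sentence. It is not the system described in the abstract: the regex stands in for both the hybrid term extractor and the chunker, and the verb list, the `DefinitionalContext` type, and the `find_definitional_context` function are hypothetical names chosen for this example.

```python
import re
from typing import NamedTuple, Optional


class DefinitionalContext(NamedTuple):
    """Hypothetical container for one extracted definitional context."""
    term: str   # the defined term (hyponym)
    verb: str   # the definitional verb linking term and definition
    genus: str  # head noun of the definition (hypernym / Genus Term)


# Assumption: a tiny, hand-picked set of Spanish definitional verbs.
# A real verb inventory would be far larger and tied to syntactic patterns.
_PATTERN = re.compile(
    r"^(?:(?:el|la|los|las|un|una)\s+)?"      # optional leading determiner
    r"(?P<term>.+?)\s+"                       # shortest candidate term
    r"(?P<verb>se define como|es|son)\s+"     # definitional verb
    r"(?:una|un|el|la|los|las)\s+"            # determiner opening the definition
    r"(?P<genus>\w+)",                        # first noun taken as the genus
    re.IGNORECASE,
)


def find_definitional_context(sentence: str) -> Optional[DefinitionalContext]:
    """Return the term, verb and genus if the sentence fits the toy pattern."""
    match = _PATTERN.match(sentence.strip())
    if match is None:
        return None
    return DefinitionalContext(
        term=match.group("term"),
        verb=match.group("verb").lower(),
        genus=match.group("genus").lower(),
    )


if __name__ == "__main__":
    print(find_definitional_context(
        "La insulina es una hormona producida por el páncreas."
    ))
```

A pattern this simple over-generates on ordinary copular sentences and misses multi-word genus phrases, which is one reason a dedicated term extractor and chunker are needed in practice.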
Related resources
Recognition and extraction of definitional contexts in Spanish for sketching a lexical network
In this paper we propose a method to exploit analytical definitions extracted from Spanish corpora, in order to build a lexical network based on the hyponymy/hyperonymy, part/whole and attribution relations. Our method considers the following steps: (a) the recognition and extraction of definitional contexts from specialized documents, (b) the identification of analytical definitions on these d...
Extraction of Definitional Contexts using Lexical Relations
In this paper we present a method for automatically extracting definitional contexts from restricted domains in Spanish. Definitional contexts are textual fragments where there is an implicit definition that can be identified by taking into account verbal patterns linking a term and its corresponding definition. Our interest is in definitional contexts with analytical definitions. Therefore, we...
Extracting a parallel corpus from comparable documents to improve translation quality in machine translation systems
Data used for training statistical machine translation method are usually prepared from three resources: parallel, non-parallel and comparable text corpora. Parallel corpora are an ideal resource for translation but due to lack of these kinds of texts, non-parallel and comparable corpora are used either for parallel text extraction. Most of existing methods for exploiting comparable corpora loo...
ECODE: A Definition Extraction System
Terminological work aims to identify knowledge about terms in specialised texts in order to compile dictionaries, glossaries or ontologies. Searching for definitions about the terms that terminographers intend to define is therefore an essential task. This search can be done in specialised corpus, where they usually appear in definitional contexts, i.e. text fragments where an author explicitly...
Extracting information from textual documents in the electronic health record: a review of recent research.
OBJECTIVES We examine recent published research on the extraction of information from textual documents in the Electronic Health Record (EHR). METHODS Literature review of the research published after 1995, based on PubMed, conference proceedings, and the ACM Digital Library, as well as on relevant publications referenced in papers already included. RESULTS 174 publications were selected an...